Docket No.: 055653-0016

PATENT 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re Application of : Customer Number: 20277

Randall J. CALISTRI-YEH, et al. : Confirmation Number: 5090

Serial No.: 10/823,617 : Group Art Unit: 2164

Filed: April 14, 2004 : Examiner: Sathyanaraya R Pannala

For: CONSTRUCTION OF TRAINABLE SEMANTIC VECTORS AND CLUSTERING,
CLASSIFICATION, AND SEARCHING USING TRAINABLE SEMANTIC VECTORS

INTERVIEW AGENDA

Mail Stop AF 
Commissioner for Patents 
P.O. Box 1450 
Alexandria, VA 22313-1450

Sir: 

Per the Examiner's request, this Agenda is submitted for the upcoming telephone interview 
to be held on September 12, 2007 at 4:00 pm EST. 

The Examiner is thanked for the favorable indication that all pending claims are in condition 
for allowance if outstanding rejections under 35 U.S.C. §101 are overcome and claim objections are
addressed. 

It is respectfully submitted that the rejections under 35 U.S.C. §101 are overcome and 
the claim objections are addressed in view of the remarks and proposed amendment
submitted herewith. Alternatively, the Examiner's guidance and thoughts on permissible 
amendments are respectfully solicited. 

It is submitted that the claims are not directed solely to mere ideas, laws of nature, or natural
phenomena. Each of the claims falls squarely into one of the classes of subject matter permitted by
35 U.S.C. §101, that is to say process or machine, respectively. Independent claims 63 and 64, for
example, are tied to a data processing system (machine), and independent claims 41, 46 and 50 are
directed to a series of computer-executed steps (process). Independent claims 69 and 70 recite a 
tangible machine-readable medium, in conformity with In re Beauregard, 53 F.3d 1583, 35 
USPQ2d 1383 (Fed. Cir. 1995). According to the Beauregard decision, computer programs 
embodied in a tangible medium, such as floppy diskettes, are statutory subject matter under 35 
U.S.C. §101. 

In addition, these claims are more than "a computer" that "solely calculates a mathematical
formula" or "a computer disk that solely stores a mathematical formula." The claimed
process and system are not a mere abstract concept or mathematical formula. Rather, the process and
system provide useful, concrete and tangible results in classifying a file or document relative to a
predetermined number of categories. For instance, claims 41, 46 and 50 are directed to a method of
classifying datasets relative to a predetermined number of categories, which is a key element for
[illegible] like search engines or databases in identifying and/or [illegible].

Claim 41 uniquely describes determining trainable semantic vectors for
each category using sample datasets, and advantageously classifies datasets according to a specific
attribute (the trainable semantic vector) of the datasets and the categories, which improves a
machine's efficiency in identifying or retrieving data, and differentiating between [illegible].
Documents or files classified in the same category are likely to be related to one another and may be 
retrieved together, while documents or datasets in different categories are unlikely to be related. 
Accordingly, the claims describe concepts that create a "useful, non-abstract result" analogous to
the method of adding a data field with information on long distance provider, which the Federal
Circuit found to be a "useful, non-abstract result that facilitates differential billing of long-distance
calls," which "fall[s] comfortably within the broad scope of patentable subject matter under §101."
Emphasis added. AT&T Corp. v. Excel Communications, Inc., 172 F.3d 1352, 50 USPQ2d 1447
(Fed. Cir. 1999).

For reasons outlined above, it is submitted that all of the rejections under 35 U.S.C. § 101 
should be withdrawn.

Incidentally, Applicants note that paragraph 5 of the Office Action asserted that claims 63-64
and 69-70 "are, at best, functional descriptive material per se." However, according to MPEP
2173.05(g), "[t]here is nothing inherently wrong with defining some part of an invention in
functional terms. Functional language does not, in and of itself, render a claim inappropriate. A
functional limitation must be evaluated and considered, just like any other limitation of the claim."
It is respectfully submitted that the rejection of claims 63-64 and 69-70 is overcome.

Furthermore, the Office Action asserted that the term "machine-executed" in the claims does 
not have a corresponding definition in the specification. By this proposed amendment, the term
"machine-executed" is amended to "computer-executed." The specification clearly describes "a
computer system upon which an embodiment of the invention may be implemented." See page 10,
lines 9-10 and Fig. T.

PROPOSED AMENDMENTS TO THE CLAIMS:

Claims 1-40 (Cancelled) 

41. (Currently Amended) A method of classifying new datasets within a predetermined
number of categories based on assignment of a plurality of sample datasets to each category, the
method comprising the [[machine]]computer-executed steps:

constructing a trainable semantic vector for each sample dataset relative to the
predetermined categories in a multi-dimensional semantic space; 

constructing a trainable semantic vector for each category based on the trainable semantic 
vectors for the sample datasets; 

receiving a new dataset; 

constructing a trainable semantic vector for the new dataset; 

determining a distance between the trainable semantic vector for the new dataset and the 
trainable semantic vector of each category; and 

classifying the new dataset within the category whose trainable semantic vector has the
shortest distance to the trainable semantic vector of the new dataset; 

wherein: 

the new data set or each of the sample data sets includes at least one data point; 

each data point corresponds to at least one of a word, a phrase, a sentence, a color, a 
typography, a punctuation, a picture, and a character string; and 

the trainable semantic vector for each sample data set or the new dataset is constructed by 
performing the steps of: 

for each data point, identifying a relationship between each data point and predetermined 
categories corresponding to dimensions in the semantic space;

determining the significance of each datapoint with respect to the predetermined categories;

constructing a trainable semantic vector for each datapoint, wherein each trainable semantic 
vector has dimensions equal to the number of predetermined categories and represents the relative 
strength of its corresponding data point with respect to each of the predetermined categories; and 

combining the trainable semantic vector for each of the at least one data point to form the 
semantic vector of the sample dataset or the new dataset. 
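
By way of illustration only, and without limiting the claim language, the steps recited in claim 41 can be sketched as follows. The sketch rests on assumptions that the claim does not state: the relationship between a data point and a category is taken to be an occurrence count in that category's sample datasets, significance is taken to be the normalized share of those counts, data point vectors are combined by summation, a category vector is the mean of its sample vectors, and distance is Euclidean. All function and variable names are hypothetical.

```python
import math
from collections import defaultdict


def train_data_point_vectors(samples, categories):
    """Build one vector per data point, one dimension per category.

    samples: list of (list_of_data_points, category) pairs.
    Assumption: relationship = occurrence count, significance = share of count.
    """
    index = {c: i for i, c in enumerate(categories)}
    counts = defaultdict(lambda: [0.0] * len(categories))
    for points, category in samples:
        for p in points:
            counts[p][index[category]] += 1.0
    vectors = {}
    for p, row in counts.items():
        total = sum(row)
        vectors[p] = [x / total for x in row]
    return vectors


def dataset_vector(points, point_vectors, n):
    """Combine data point vectors into a dataset vector (assumed: summation)."""
    vec = [0.0] * n
    for p in points:
        # Data points never seen in the samples contribute a zero vector.
        for i, x in enumerate(point_vectors.get(p, [0.0] * n)):
            vec[i] += x
    return vec


def build_category_vectors(samples, point_vectors, categories):
    """Category vector = mean of the vectors of its sample datasets (assumed)."""
    n = len(categories)
    sums = {c: [0.0] * n for c in categories}
    sizes = {c: 0 for c in categories}
    for points, category in samples:
        v = dataset_vector(points, point_vectors, n)
        sums[category] = [a + b for a, b in zip(sums[category], v)]
        sizes[category] += 1
    return {c: [x / sizes[c] for x in sums[c]] for c in categories if sizes[c]}


def classify_nearest_category(points, point_vectors, cat_vectors, n):
    """Classify into the category whose vector is closest to the new dataset."""
    v = dataset_vector(points, point_vectors, n)
    return min(cat_vectors, key=lambda c: math.dist(v, cat_vectors[c]))


# Hypothetical sample datasets, each assigned to one category.
CATEGORIES = ["billing", "shipping"]
SAMPLES = [
    (["invoice", "refund", "charge"], "billing"),
    (["refund", "card"], "billing"),
    (["package", "delivery", "late"], "shipping"),
    (["delivery", "tracking"], "shipping"),
]
POINT_VECTORS = train_data_point_vectors(SAMPLES, CATEGORIES)
CATEGORY_VECTORS = build_category_vectors(SAMPLES, POINT_VECTORS, CATEGORIES)
RESULT = classify_nearest_category(
    ["refund", "late", "invoice"], POINT_VECTORS, CATEGORY_VECTORS, len(CATEGORIES)
)
```

Other weighting, combining, or distance choices would fit the same outline; the ones used here were picked only to keep the sketch short.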

42. (Currently Amended) The method of Claim [[41]]41, wherein the datasets
correspond to documents. 

43. (Currently Amended) The method of Claim [[41]]41, wherein the datasets
correspond to email messages and the categories correspond to frequently asked questions with 
substantially static responses. 

44. (Original) The method of Claim 41, further comprising the steps: 
detecting when a prescribed number of new datasets has been classified; and 
updating the trainable semantic vectors for each of the categories.

45. (Original) The method of Claim 44, wherein the step of updating comprises the step 
of reconstructing trainable semantic vectors for each category based on the trainable semantic 
vectors for the sample datasets and the trainable semantic vectors for the new datasets added to each 
category. 
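
By way of illustration only, the updating recited in claims 44 and 45 can be sketched as a counter that triggers reconstruction of the category vectors once a prescribed number of new datasets has been classified. The class name, the `threshold` parameter, and the use of a mean to reconstruct each category vector are assumptions made for this sketch; each category is assumed to start with at least one sample vector.

```python
class CategoryVectorUpdater:
    """Rebuild category vectors after a prescribed number of classifications."""

    def __init__(self, sample_vectors, threshold):
        # sample_vectors: category -> list of dataset vectors (non-empty lists).
        self.members = {c: [list(v) for v in vs] for c, vs in sample_vectors.items()}
        self.threshold = threshold  # prescribed number of new datasets
        self.pending = 0            # new datasets classified since last rebuild
        self.category_vectors = self._rebuild()

    def _rebuild(self):
        # Assumed reconstruction rule: per-dimension mean over all member
        # vectors, covering both sample datasets and added new datasets.
        rebuilt = {}
        for category, vectors in self.members.items():
            rebuilt[category] = [sum(col) / len(vectors) for col in zip(*vectors)]
        return rebuilt

    def record(self, category, vector):
        """Register a newly classified dataset; return True if a rebuild ran."""
        self.members[category].append(list(vector))
        self.pending += 1
        if self.pending >= self.threshold:
            self.category_vectors = self._rebuild()
            self.pending = 0
            return True
        return False


updater = CategoryVectorUpdater({"a": [[1.0, 0.0]], "b": [[0.0, 1.0]]}, threshold=2)
first = updater.record("a", [3.0, 0.0])    # below the threshold, no rebuild
second = updater.record("b", [0.0, 3.0])   # threshold reached, vectors rebuilt
```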

46. (Currently Amended) A method of classifying new datasets within a predetermined
number of categories based on assignment of a plurality of sample datasets to each category, the
method comprising the [[machine]]computer-executed steps:

constructing a trainable semantic vector for each sample dataset relative to the
predetermined categories in a multi-dimensional semantic space; 
receiving a new dataset; 

constructing a trainable semantic vector for the new dataset; 

identifying a select number of sample datasets whose trainable semantic vectors are closest 
in distance to the trainable semantic vector for the new dataset; and 

classifying the new dataset in the category containing the greatest number of the select 
sample datasets; 

wherein: 

the new data set or each of the sample data sets includes at least one data point; 
each data point corresponds to at least one of a word, a phrase, a sentence, a color, a 
typography, a punctuation, a picture, and a character string; and 

the trainable semantic vector for each sample data set or the new dataset is constructed by 
performing the steps of:

for each data point, identifying a relationship between each data point and predetermined 
categories corresponding to dimensions in the semantic space; 

determining the significance of each data point with respect to the predetermined categories; 

constructing a trainable semantic vector for each data point, wherein each trainable semantic 
vector has dimensions equal to the number of predetermined categories and represents the relative 
strength of its corresponding data point with respect to each of the predetermined categories; and 

combining the trainable semantic vector for each of the at least one data point to form the 
semantic vector of the sample dataset or the new dataset.

47. (Currently Amended) The method of Claim [[46]]46, wherein the datasets
correspond to documents. 

48. (Currently Amended) The method of Claim [[46]]46, wherein the datasets
correspond to email messages and the categories correspond to frequently asked questions with 
substantially static responses. [[49.]] 

49. (Original) The method of Claim 46, further comprising the steps: 
detecting when a prescribed number of new datasets has been classified; and 
adding the new datasets to the set of sample datasets. 
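
By way of illustration only, the classification recited in claim 46 resembles a nearest-neighbor vote and can be sketched as follows. The sketch assumes the trainable semantic vectors have already been constructed, uses Euclidean distance, and breaks ties in favor of the category whose sample is closest; none of these choices is stated in the claim, and the function and parameter names are hypothetical.

```python
import math
from collections import Counter


def classify_by_nearest_samples(new_vector, sample_vectors, k):
    """Classify by majority category among the k closest sample datasets.

    sample_vectors: list of (vector, category) pairs.
    k: the "select number" of sample datasets to consider.
    """
    # Sort samples by distance to the new dataset and keep the k closest.
    nearest = sorted(sample_vectors, key=lambda s: math.dist(new_vector, s[0]))[:k]
    votes = Counter(category for _, category in nearest)
    # most_common keeps first-seen order among equal counts, so a tie goes
    # to the category of the closer sample (an assumed tie-break rule).
    return votes.most_common(1)[0][0]


SAMPLE_VECTORS = [
    ([3.0, 0.0], "billing"),
    ([2.0, 0.0], "billing"),
    ([0.0, 3.0], "shipping"),
    ([0.0, 2.0], "shipping"),
]
CHOSEN = classify_by_nearest_samples([2.0, 1.0], SAMPLE_VECTORS, k=3)
```

Under claim 49, the newly classified datasets could later be appended to `SAMPLE_VECTORS` once a prescribed number has accumulated.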



50. (Currently Amended) A method of classifying new datasets within a predetermined
number of categories, the method comprising the [[machine]]computer-executed steps:
receiving a new dataset; 

constructing a trainable semantic vector for the new dataset, where the dimensions of the 
trainable semantic vector correspond to the predetermined number of categories; 

classifying the dataset in the category whose corresponding dimension in the trainable 
semantic vector has the largest value; 

wherein: 

the new data set includes one or more data point; 

each data point corresponds to at least one of a word, a phrase, a sentence, a color, a
typography, a punctuation, a picture, and a character string; and

the trainable semantic vector for the new dataset is constructed by performing the steps of: 
for each data point within the new dataset, identifying a relationship between each data point
and predetermined categories corresponding to dimensions in the semantic space;

determining the significance of each data point with respect to the predetermined categories;

constructing a trainable semantic vector for each data point, wherein each trainable semantic 
vector has dimensions equal to the number of predetermined categories and represents the relative 
strength of its corresponding data point with respect to each of the predetermined categories; and 

combining the trainable semantic vector for each data point to form the semantic vector of 
the new dataset. 
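
By way of illustration only, the classification recited in claim 50 reduces to selecting the dimension with the largest value once the dataset vector has been formed. The data point vectors below are hypothetical values supplied for the example, and combining them by summation is an assumption of the sketch rather than a requirement of the claim.

```python
def combine_point_vectors(points, point_vectors, n):
    """Form a dataset vector by summing its data point vectors (assumed rule)."""
    vec = [0.0] * n
    for p in points:
        # Data points without a known vector contribute nothing.
        for i, x in enumerate(point_vectors.get(p, [0.0] * n)):
            vec[i] += x
    return vec


def classify_by_largest_dimension(dataset_vec, categories):
    """Return the category whose dimension holds the largest value."""
    best = max(range(len(categories)), key=lambda i: dataset_vec[i])
    return categories[best]


CATEGORY_NAMES = ["billing", "shipping", "returns"]
# Hypothetical data point vectors: one dimension per category.
KNOWN_POINTS = {
    "refund": [0.6, 0.0, 0.4],
    "late": [0.0, 0.9, 0.1],
    "invoice": [1.0, 0.0, 0.0],
}
NEW_VECTOR = combine_point_vectors(
    ["refund", "late", "invoice"], KNOWN_POINTS, len(CATEGORY_NAMES)
)
LABEL = classify_by_largest_dimension(NEW_VECTOR, CATEGORY_NAMES)
```

Unlike the approaches of claims 41 and 46, this variant needs no category vectors or stored sample vectors at classification time.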

51. (Currently Amended) The method of Claim [[50]]50, wherein the datasets
correspond to documents. 

52. (Currently Amended) The method of Claim [[50]]50, wherein the datasets 
correspond to email messages and the categories correspond to frequently asked questions with 
substantially static responses. 

Claims 53-62 (Cancelled) 

63. (Previously Amended) A system for classifying new datasets within a predetermined
number of categories based on assignment of a plurality of sample datasets to each category, the
system comprising:

a computer configured to:

construct a trainable semantic vector for each sample dataset relative to the 
predetermined categories in a multi-dimensional semantic space; 

construct a trainable semantic vector for each category based on the trainable 
semantic vectors for the sample datasets;
receive a new dataset;

construct a trainable semantic vector for the new dataset;

determine a distance between the trainable semantic vector for the new dataset and
the trainable semantic vector of each category; and 

classify the new dataset within the category whose trainable semantic vector has the 
shortest distance to the trainable semantic vector of the new dataset; 
wherein: 

the new data set or each of the sample data sets includes at least one data point; 
each data point corresponds to at least one of a word, a phrase, a sentence, a color, a 
typography, a punctuation, a picture, and a character string; and 

the trainable semantic vector for each sample data set or the new dataset is constructed by 
performing the steps of: 

for each data point, identifying a relationship between each data point and predetermined 
categories corresponding to dimensions in the semantic space; 

determining the significance of each data point with respect to the predetermined categories; 

constructing a trainable semantic vector for each data point, wherein each trainable semantic 
vector has dimensions equal to the number of predetermined categories and represents the relative 
strength of its corresponding data point with respect to each of the predetermined categories; and 

combining the trainable semantic vector for each of the at least one data point to form the
semantic vector of the sample dataset or the new dataset. 




64. (Previously Amended) A system for classifying new datasets within a predetermined
number of categories based on assignment of a plurality of sample datasets to each category, the
system comprising: 

a computer configured to:

construct a trainable semantic vector for each sample dataset relative to the
predetermined categories in a multi-dimensional semantic space; 
receive a new dataset; 

construct a trainable semantic vector for the new dataset; 

identify a select number of sample datasets whose trainable semantic vectors are 
closest in distance to the trainable semantic vector for the new dataset; and 

classify the new dataset in the category containing the greatest number of the select 
sample datasets; 

wherein: 

the new data set or each of the sample data sets includes at least one data point; 
each data point corresponds to at least one of a word, a phrase, a sentence, a color, a
typography, a punctuation, a picture, and a character string; and

the trainable semantic vector for each sample data set or the new dataset is constructed by
performing the steps of:

for each data point, identifying a relationship between each data point and predetermined
categories corresponding to dimensions in the semantic space;

determining the significance of each data point with respect to the predetermined categories; 

and 

constructing a trainable semantic vector for each data point, wherein each trainable semantic 
vector has dimensions equal to the number of predetermined categories and represents the relative 
strength of its corresponding data point with respect to each of the predetermined categories; and 

combining the trainable semantic vector for each of the at least one data point to form the 
semantic vector of the sample dataset or the new dataset.

Claims 65-68 (Cancelled)

69. (Currently amended) A computer-readable medium carrying one or more sequences
of instructions for classifying new datasets within a predetermined number of categories based on
assignment of a plurality of sample datasets to each category, wherein execution of the one or more
sequences of instructions by one or more processors causes the one or more processors to perform
the [[machine]]computer-executed steps of:

constructing a trainable semantic vector for each sample dataset relative to the
predetermined categories in a multi-dimensional semantic space;

constructing a trainable semantic vector for each category based on the trainable semantic 
vectors for the sample datasets; 
receiving a new dataset; 

constructing a trainable semantic vector for the new dataset; 

determining a distance between the trainable semantic vector for the new dataset and the 
trainable semantic vector of each category; and 

classifying the new dataset within the category whose trainable semantic vector has the 
shortest distance to the trainable semantic vector of the new dataset; 

wherein: 

the new data set or each of the sample data sets includes at least one data point; 
each data point corresponds to at least one of a word, a phrase, a sentence, a color, a 
typography, a punctuation, a picture, and a character string; and 

the trainable semantic vector for each sample data set or the new dataset is constructed by 
performing the steps of:

for each data point, identifying a relationship between each data point and predetermined
categories corresponding to dimensions in the semantic space; 

determining the significance of each data point with respect to the predetermined categories; 

constructing a trainable semantic vector for each data point, wherein each trainable semantic 
vector has dimensions equal to the number of predetermined categories and represents the relative 
strength of its corresponding data point with respect to each of the predetermined categories; and 

combining the trainable semantic vector for each of the at least one data point to form the 
semantic vector of the sample dataset or the new dataset. 




70. (Previously Amended) A computer-readable medium carrying one or more
sequences of instructions for classifying new datasets within a predetermined number of categories 
based on assignment of a plurality of sample datasets to each category, wherein execution of the one 
or more sequences of instructions by one or more processors causes the one or more processors to 
perform the steps of: 

constructing a trainable semantic vector for each sample dataset relative to the 
predetermined categories in a multi-dimensional semantic space;
receiving a new dataset; 

constructing a trainable semantic vector for the new dataset; 

identifying a select number of select datasets whose trainable semantic vectors are closest in 
distance to the trainable semantic vector for the new dataset; and 

classifying the new dataset in the category containing the greatest number of the select
dataset;
wherein: 

the new data set or each of the sample data sets includes at least one data point;
each data point corresponds to at least one of a word, a phrase, a sentence, a color, a
typography, a punctuation, a picture, and a character string; and 

the trainable semantic vector for each sample data set or the new dataset is constructed by 
performing the steps of: 

for each data point, identifying a relationship between each data point and predetermined 
categories corresponding to dimensions in the semantic space; 

determining the significance of each data point with respect to the predetermined categories; 

constructing a trainable semantic vector for each data point, wherein each trainable semantic 
vector has dimensions equal to the number of predetermined categories and represents the relative 
strength of its corresponding data point with respect to each of the predetermined categories; and 

combining the trainable semantic vector for each of the at least one data point to form the 
semantic vector of the sample dataset or the new dataset. 

Claims 71-78 (Cancelled)

Respectfully submitted,
McDERMOTT WILL & EMERY LLP

Wei-Chen Nicholas Chen 
Registration No. 56,665 
650-813-5092